Statistical Mechanics

2 Probability

Definition

A probability space consists of a sample space $\Omega$, such that for (some of the) events $E \subseteq \Omega$, we assign a probability $P(E) \in [0,1]$ so that

  • $P(\Omega) = 1$, and
  • if $E_1 \cap E_2 = \varnothing$ are disjoint events, then $P(E_1 \cup E_2) = P(E_1) + P(E_2)$.

basically the usual countable additivity

Definition

The objective probability is obtained experimentally from the relative frequency of an occurrence in many repetitions of a random process,

$$P(A) = \lim_{n \to \infty} \frac{n_A}{n},$$

where $n$ is the number of times the random process is repeated and $n_A$ is the number of times event $A$ occurs in those $n$ repetitions. Clearly, the bigger $n$ is, the more accurate the estimate.
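As a quick illustration (my own sketch, not part of the notes), the relative frequency can be estimated numerically; the die example and the seed below are arbitrary choices.

```python
# A minimal sketch: estimate P(A) by relative frequency, where A = "a fair die shows a 6".
# The true value is 1/6; the estimate improves as n grows.
import numpy as np

rng = np.random.default_rng(0)
for n in [10**2, 10**4, 10**6]:
    rolls = rng.integers(1, 7, size=n)   # n repetitions of the random process
    n_A = np.count_nonzero(rolls == 6)   # number of times event A occurred
    print(n, n_A / n)                    # relative frequency approaches 1/6
```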

Definition

A real random variable is a map $X : \Omega \to \mathbb{R}$, which can also be defined by a cumulative distribution function

$$F(x) = P(\{\omega \in \Omega : X(\omega) \le x\}) = P(X \le x).$$
Definition

The expectation value of a function $g(X)$, denoted $\langle g(X) \rangle$, is

$$\langle g(X) \rangle = \int p(x)\, g(x)\, dx.$$

One specific example is the moments of a random variable $X$, which are

$$\langle X^m \rangle = \int p(x)\, x^m\, dx.$$
Definition

The characteristic function of a random variable $X$ is

$$\tilde{p}(k) \equiv \langle e^{-ikX} \rangle = \int p(x)\, e^{-ikx}\, dx,$$

and because this is a Fourier transform we may obtain the distribution function $p(x)$ via the Fourier inversion theorem:

$$p(x) = \frac{1}{2\pi} \int \tilde{p}(k)\, e^{ikx}\, dk$$

now we do a Taylor expansion of the characteristic function.

The Taylor series of $\tilde{p}(k)$ about $k = 0$ is:

$$\tilde{p}(k) = \sum_{m=0}^{\infty} \frac{k^m}{m!} \left.\frac{d^m \tilde{p}(k)}{dk^m}\right|_{k=0}.$$

We need to compute the derivatives $\dfrac{d^m \tilde{p}(k)}{dk^m}$.

Differentiate under the integral (justified for well-behaved $p(x)$):

$$\frac{d^m \tilde{p}(k)}{dk^m} = \frac{d^m}{dk^m} \int p(x)\, e^{-ikx}\, dx = \int p(x)\, \frac{\partial^m}{\partial k^m}\left(e^{-ikx}\right) dx.$$

The partial derivative is:

$$\frac{\partial^m}{\partial k^m}\left(e^{-ikx}\right) = (-ix)^m e^{-ikx}.$$

So:

$$\frac{d^m \tilde{p}(k)}{dk^m} = \int p(x)\, (-ix)^m e^{-ikx}\, dx.$$

Evaluate at $k = 0$:

$$\left.\frac{d^m \tilde{p}(k)}{dk^m}\right|_{k=0} = \int p(x)\, (-ix)^m\, dx = (-i)^m \int p(x)\, x^m\, dx = (-i)^m \langle X^m \rangle,$$

where $\langle X^m \rangle = \int p(x)\, x^m\, dx$ is the $m$-th moment. So we have

$$\tilde{p}(k) = \sum_{m=0}^{\infty} \frac{k^m}{m!}\, (-i)^m \langle X^m \rangle = \sum_{m=0}^{\infty} \frac{(-ik)^m}{m!} \langle X^m \rangle.$$
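A quick numerical sanity check of this expansion (my own sketch, not from the notes): for $X$ uniform on $[0,1]$ the moments are $\langle X^m \rangle = 1/(m+1)$ and the exact characteristic function is $\tilde{p}(k) = (1 - e^{-ik})/(ik)$, so the moment series should reproduce it.

```python
# A minimal sketch: rebuild p~(k) from the moments of a uniform[0,1] random variable
# via p~(k) = sum_m (-ik)^m / m! * <x^m>, and compare with the closed form.
import math
import numpy as np

k = 2.0                                                  # arbitrary test value
exact = (1 - np.exp(-1j * k)) / (1j * k)
series = sum((-1j * k) ** m / math.factorial(m) / (m + 1) for m in range(30))
print(exact, series)                                     # agree to machine precision
```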
Definition

the cumulant generating function is the logarithm of the characteristic function,

$$\ln \tilde{p}(k) = \sum_{n=1}^{\infty} \frac{(-ik)^n}{n!} \langle x^n \rangle_c.$$

Proposition

We have

$$\langle x^m \rangle = \sum_{\{p_n\}} m!\, \prod_{n} \frac{\langle x^n \rangle_c^{\,p_n}}{p_n!\,(n!)^{p_n}},$$

where the sum is over all sets $\{p_n\}$ subject to the constraint $\sum_n n\, p_n = m$. We may view the $n$-th cumulant as a connected cluster of $n$ points, and the $m$-th moment as the sum over all ways of subdividing $m$ points into clusters of smaller sizes.

first by definition we have

$$\sum_{m=0}^{\infty} \frac{(-ik)^m}{m!} \langle x^m \rangle = \exp\left[\sum_{n=1}^{\infty} \frac{(-ik)^n}{n!} \langle x^n \rangle_c\right]$$

we may rewrite the RHS as

$$\exp\left[\sum_{n=1}^{\infty} \frac{(-ik)^n}{n!} \langle x^n \rangle_c\right] = \prod_{n=1}^{\infty} \exp\left(\frac{(-ik)^n}{n!} \langle x^n \rangle_c\right)$$

for each $n$ expand the individual exponential using its Taylor series

$$\exp\left(\frac{(-ik)^n}{n!} \langle x^n \rangle_c\right) = \sum_{p_n=0}^{\infty} \frac{1}{p_n!} \left(\frac{(-ik)^n}{n!} \langle x^n \rangle_c\right)^{p_n}$$

simplify to get for each n

$$\frac{1}{p_n!} \left(\frac{(-ik)^n}{n!} \langle x^n \rangle_c\right)^{p_n} = \frac{(-ik)^{n p_n}}{p_n!} \left(\frac{\langle x^n \rangle_c}{n!}\right)^{p_n} = (-ik)^{n p_n}\, \frac{\langle x^n \rangle_c^{\,p_n}}{p_n!\,(n!)^{p_n}}$$

so altogether we get

$$\sum_{m=0}^{\infty} \frac{(-ik)^m}{m!} \langle x^m \rangle = \prod_{n=1}^{\infty} \left[\sum_{p_n=0}^{\infty} (-ik)^{n p_n}\, \frac{\langle x^n \rangle_c^{\,p_n}}{p_n!\,(n!)^{p_n}}\right]$$

now we equate the coefficients of $(-ik)^m$, in which case for each $m$ the contributions from the LHS and RHS respectively are

$$\frac{\langle x^m \rangle}{m!} = \sum_{\{p_n\}\,:\,\sum_n n p_n = m}\; \prod_n \frac{\langle x^n \rangle_c^{\,p_n}}{p_n!\,(n!)^{p_n}}$$

rearranging this gives the desired result

now the constraint $\sum_n n\, p_n = m$ simply counts the ways of breaking $m$ points into clusters, where each $p_n$ is the number of clusters of size $n$.

Remark

  • Above, the constraint also serves to select a particular set of values $\{p_n\}$; the product $\prod_n$ then multiplies together the contributions of this pre-selected set.
  • Next, notice that for each chosen way of partitioning we divide by $p_n!$ and $(n!)^{p_n}$, since we do not want to double count permutations of the clusters themselves or of the points within each cluster.

Example

graphically
../../../Attachments/Pasted image 20250721054041.png
corresponds to

$$\begin{aligned}
\langle x \rangle &= \langle x \rangle_c, \\
\langle x^2 \rangle &= \langle x^2 \rangle_c + \langle x \rangle_c^2, \\
\langle x^3 \rangle &= \langle x^3 \rangle_c + 3\langle x^2 \rangle_c \langle x \rangle_c + \langle x \rangle_c^3, \\
\langle x^4 \rangle &= \langle x^4 \rangle_c + 4\langle x^3 \rangle_c \langle x \rangle_c + 3\langle x^2 \rangle_c^2 + 6\langle x^2 \rangle_c \langle x \rangle_c^2 + \langle x \rangle_c^4.
\end{aligned}$$
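These relations can also be checked symbolically (my own sketch, not from the notes): expand the exponential of the cumulant series with sympy and read off each moment; the symbols c1..c4 stand for $\langle x \rangle_c, \ldots, \langle x^4 \rangle_c$.

```python
# A minimal sketch: verify the moment-cumulant relations by expanding
# exp( sum_n (-ik)^n / n! * c_n ) in powers of k and extracting <x^m>.
import sympy as sp

k = sp.symbols('k')
c = sp.symbols('c1:5')                               # c1..c4 = cumulants <x^n>_c
log_p = sum((-sp.I * k) ** n / sp.factorial(n) * c[n - 1] for n in range(1, 5))
p = sp.exp(log_p).series(k, 0, 5).removeO()
for m in range(1, 5):
    moment = sp.expand(sp.simplify(p.coeff(k, m) * sp.factorial(m) / (-sp.I) ** m))
    print(f"<x^{m}> =", moment)
# prints e.g. <x^2> = c1**2 + c2 and <x^4> = c1**4 + 6*c1**2*c2 + 4*c1*c3 + 3*c2**2 + c4
```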

2.3 Some Important Probability Distributions

Definition

the normal (Gaussian) distribution describes a continuous real random variable $x$ with

$$p(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left[-\frac{(x-\lambda)^2}{2\sigma^2}\right].$$
Proposition

the corresponding characteristic function is then

$$\tilde{p}(k) = \int dx\, \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left[-\frac{(x-\lambda)^2}{2\sigma^2} - ikx\right] = \exp\left[-ik\lambda - \frac{k^2\sigma^2}{2}\right].$$

Proof: essentially we group all $x$ terms into a quadratic form so that we may apply the standard Gaussian integral result. Specifically, we first expand the exponent

$$-\frac{(x-\lambda)^2}{2\sigma^2} - ikx = -\frac{x^2}{2\sigma^2} + \frac{\lambda x}{\sigma^2} - \frac{\lambda^2}{2\sigma^2} - ikx = -\frac{x^2}{2\sigma^2} + \left(\frac{\lambda}{\sigma^2} - ik\right)x - \frac{\lambda^2}{2\sigma^2}$$

then, letting the term in the parentheses be $b = \lambda/\sigma^2 - ik$, we rearrange to get

$$-\frac{1}{2\sigma^2}\left(x^2 - 2\sigma^2 b\, x\right) - \frac{\lambda^2}{2\sigma^2}$$

basically with the goal of separating $x$ from the constants in the exponent. Then finally, to get the Gaussian integral form we need a quadratic in $x$, so we complete the square in the parentheses to get

$$-\frac{1}{2\sigma^2}\left[(x - \sigma^2 b)^2 - (\sigma^2 b)^2\right] - \frac{\lambda^2}{2\sigma^2} = -\frac{(x - \sigma^2 b)^2}{2\sigma^2} + \frac{\sigma^2 b^2}{2} - \frac{\lambda^2}{2\sigma^2}.$$

we substitute $b$ back into the constant part

$$\frac{\sigma^2 b^2}{2} - \frac{\lambda^2}{2\sigma^2} = \frac{\lambda^2}{2\sigma^2} - ik\lambda - \frac{k^2\sigma^2}{2} - \frac{\lambda^2}{2\sigma^2} = -ik\lambda - \frac{k^2\sigma^2}{2}.$$

and then separate out the $x$-dependence and the constants in the exponent as desired

$$\tilde{p}(k) = \frac{1}{\sqrt{2\pi\sigma^2}}\, \exp\left(-ik\lambda - \frac{k^2\sigma^2}{2}\right) \int \exp\left[-\frac{(x - \sigma^2 b)^2}{2\sigma^2}\right] dx.$$

as planned, we recognize the Gaussian integral on the right, and therefore we have

$$\tilde{p}(k) = \frac{1}{\sqrt{2\pi\sigma^2}}\, \exp\left(-ik\lambda - \frac{k^2\sigma^2}{2}\right) \sqrt{2\pi\sigma^2} = \exp\left(-ik\lambda - \frac{k^2\sigma^2}{2}\right)$$

and so the cumulant generating function takes the form

$$\ln \tilde{p}(k) = -ik\lambda - k^2\sigma^2/2,$$

so it is clear by comparing with the definition above that

$$\langle x \rangle_c = \lambda, \qquad \langle x^2 \rangle_c = \sigma^2, \qquad \langle x^3 \rangle_c = \langle x^4 \rangle_c = \cdots = 0$$

in which case calculation of the moments from the cumulants using the previous proposition gives

$$\langle x \rangle = \lambda, \qquad \langle x^2 \rangle = \sigma^2 + \lambda^2, \qquad \langle x^3 \rangle = 3\sigma^2\lambda + \lambda^3, \qquad \langle x^4 \rangle = 3\sigma^4 + 6\sigma^2\lambda^2 + \lambda^4.$$
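These moments are easy to confirm by sampling (my own sketch, not from the notes); the values of $\lambda$ and $\sigma$ below are arbitrary.

```python
# A minimal sketch: check the Gaussian moments against sample averages.
import numpy as np

lam, sigma = 1.5, 0.7
x = np.random.default_rng(1).normal(lam, sigma, size=10**7)
print(x.mean(),      lam)                                      # <x>   = lambda
print((x**2).mean(), sigma**2 + lam**2)                        # <x^2> = sigma^2 + lambda^2
print((x**3).mean(), 3*sigma**2*lam + lam**3)                  # <x^3> = 3 sigma^2 lambda + lambda^3
print((x**4).mean(), 3*sigma**4 + 6*sigma**2*lam**2 + lam**4)  # <x^4>
```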

Definition

the binomial distribution: consider a random variable with two outcomes $A$ and $B$ of relative probabilities $p_A$ and $p_B = 1 - p_A$. Then the probability that $A$ occurs $N_A$ times in $N$ trials is

$$p_N(N_A) = \binom{N}{N_A}\, p_A^{N_A}\, p_B^{N - N_A}$$

note that we have

$$\binom{N}{N_A} = \frac{N!}{N_A!\,(N - N_A)!}$$

then our characteristic function is given by (the final equality follows because the sum is simply the binomial expansion of the expression on the right)

$$\tilde{p}_N(k) = \left\langle e^{-ik N_A} \right\rangle = \sum_{N_A=0}^{N} \frac{N!}{N_A!\,(N - N_A)!}\, p_A^{N_A}\, p_B^{N - N_A}\, e^{-ik N_A} = \left(p_A e^{-ik} + p_B\right)^N.$$

and consequently our cumulant generating function is

$$\ln \tilde{p}_N(k) = N \ln\left(p_A e^{-ik} + p_B\right) = N \ln \tilde{p}_1(k),$$
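Expanding this cumulant generating function to second order gives the familiar first two cumulants $\langle N_A \rangle_c = N p_A$ and $\langle N_A^2 \rangle_c = N p_A p_B$; a quick sampling check (my own sketch, not from the notes):

```python
# A minimal sketch: the mean and variance implied by ln p~_N(k) = N ln(p_A e^{-ik} + p_B)
# are N p_A and N p_A p_B; compare with binomial samples.
import numpy as np

N, pA = 100, 0.3
samples = np.random.default_rng(2).binomial(N, pA, size=10**6)
print(samples.mean(), N * pA)               # first cumulant  = N p_A
print(samples.var(),  N * pA * (1 - pA))    # second cumulant = N p_A p_B
```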
Definition

the Poisson distribution is obtained from the binomial distribution in the limit $N \to \infty$, $p_A \to 0$ with the mean $\alpha = N p_A$ held fixed; the probability of $M$ occurrences is

$$p(M) = \frac{\alpha^M e^{-\alpha}}{M!}, \qquad M = 0, 1, 2, \ldots$$

2.4 Many Random Variables

Definition

the **joint PDF** $p(\mathbf{x})$ is the probability density of an outcome in a volume element $d^N\mathbf{x} = \prod_{i=1}^{N} dx_i$ around the point $\mathbf{x} = \{x_1, x_2, \ldots, x_N\}$

in other words, letting $\mathbf{X}$ be a vector of random variables:

$$p(\mathbf{x}) = \lim_{dx_i \to 0} \frac{P(\mathbf{X} \text{ is in the box around } \mathbf{x} \text{ of widths } dx_i)}{dx_1\, dx_2 \cdots dx_N}.$$

the joint PDF is normalized such that

$$p_{\mathbf{x}}(S) = \int d^N\mathbf{x}\; p(\mathbf{x}) = 1$$

if and only if the $N$ random variables are independent, the joint PDF is the product of the individual PDFs

$$p(\mathbf{x}) = \prod_{i=1}^{N} p_i(x_i)$$
Definition

the unconditional PDF describes the behavior of a subset of the random variables, independent of the values of the others. For example, if you are only interested in the first $m$ of the $N$ variables, then:

$$p(x_1, \ldots, x_m) = \int \prod_{i=m+1}^{N} dx_i\; p(x_1, \ldots, x_N).$$

where effectively we have integrated over all the other non-relevant variables for each $\{x_1, \ldots, x_m\}$.

Observe that now our PDF does not depend on the variables $\{x_{m+1}, \ldots, x_N\}$, as they are automatically included for any $\{x_1, \ldots, x_m\}$. For example, say we have a position vector $(x, y, z)$ and we are only interested in $x$. The unconditional PDF simply includes all probability density contributions over $(y, z)$ for each $x$, so our PDF is independent of $y, z$

Definition

the conditional PDF describes the behavior of a subset of random variables for specified values of the others. For example, consider $p(\vec{v} \mid \vec{x}) = p(\vec{x}, \vec{v})/\mathcal{N}$, where $p(\vec{x}, \vec{v})$ is the joint PDF and $p(\vec{v} \mid \vec{x})$ is the conditional PDF for the velocity $\vec{v}$ at a given fixed position $\vec{x}$

note we have $\mathcal{N}$, a normalization factor that ensures

$$\int \frac{p(\vec{x}, \vec{v})}{\mathcal{N}}\, d^3\vec{v} = \int p(\vec{v} \mid \vec{x})\, d^3\vec{v} = 1$$

so we have

$$\mathcal{N} = \int d^3\vec{v}\; p(\vec{x}, \vec{v}) = p(\vec{x}),$$

where the final equality follows because the middle expression is simply the definition of the unconditional PDF as defined just previously!

Proposition

Essentially we have just shown Bayes' theorem:

$$p(x_1, \ldots, x_m \mid x_{m+1}, \ldots, x_N) = \frac{p(x_1, \ldots, x_N)}{p(x_{m+1}, \ldots, x_N)}.$$
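A tiny discrete illustration of this relation (my own sketch, not from the notes); the joint table below is made up.

```python
# A minimal sketch: p(x|y) = p(x, y) / p(y) on a small discrete joint table.
import numpy as np

p_xy = np.array([[0.10, 0.20],
                 [0.30, 0.40]])       # joint distribution p(x, y), rows = x, cols = y
p_y = p_xy.sum(axis=0)                # unconditional p(y): sum (integrate) out x
p_x_given_y = p_xy / p_y              # conditional p(x|y), one column per value of y
print(p_x_given_y.sum(axis=0))        # each column sums to 1, as a conditional PDF must
```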
Definition

the expectation value of a function $F(\mathbf{x})$ is obtained as before from

$$\langle F(\mathbf{x}) \rangle = \int d^N\mathbf{x}\; p(\mathbf{x})\, F(\mathbf{x}).$$

and the joint characteristic function is then the $N$-dimensional Fourier transform of the joint PDF

$$\tilde{p}(\mathbf{k}) = \left\langle \exp\left(-i\sum_{j=1}^{N} k_j x_j\right) \right\rangle = \int p(x_1, \ldots, x_N)\, \exp\left(-i\sum_{j=1}^{N} k_j x_j\right) dx_1 \cdots dx_N.$$

like before we do a Taylor expansion, but this time for the multivariate case

$$\exp\left(-i\sum_{j=1}^{N} k_j x_j\right) = \sum_{m=0}^{\infty} \frac{1}{m!}\left(-i\sum_{j=1}^{N} k_j x_j\right)^m$$

we may now apply the multinomial expansion given by

$$\left(\sum_{j=1}^{N} k_j x_j\right)^m = \sum_{n_1 + \cdots + n_N = m} \frac{m!}{n_1! \cdots n_N!}\, (k_1 x_1)^{n_1} \cdots (k_N x_N)^{n_N},$$

which we may substitute into our expression to get

$$\exp\left(-i\sum_{j=1}^{N} k_j x_j\right) = \sum_{m=0}^{\infty} \frac{(-i)^m}{m!} \sum_{n_1 + \cdots + n_N = m} \frac{m!}{n_1! \cdots n_N!}\, (k_1 x_1)^{n_1} \cdots (k_N x_N)^{n_N} = \sum_{m=0}^{\infty} \sum_{n_1 + \cdots + n_N = m} \frac{(-i)^m}{n_1! \cdots n_N!}\, (k_1 x_1)^{n_1} \cdots (k_N x_N)^{n_N}.$$

with this we may rewrite our characteristic function $\tilde{p}(\mathbf{k})$ like so

$$\tilde{p}(\mathbf{k}) = E\left[\sum_{m=0}^{\infty} \sum_{n_1 + \cdots + n_N = m} \frac{(-i)^m}{n_1! \cdots n_N!}\, (k_1 x_1)^{n_1} \cdots (k_N x_N)^{n_N}\right] = \sum_{m=0}^{\infty} \sum_{n_1 + \cdots + n_N = m} \frac{(-i)^m}{n_1! \cdots n_N!}\, k_1^{n_1} \cdots k_N^{n_N}\, E\left[x_1^{n_1} \cdots x_N^{n_N}\right].$$

where $E\left[x_1^{n_1} \cdots x_N^{n_N}\right] = \langle x_1^{n_1} \cdots x_N^{n_N} \rangle$ (the quantity inside is a product). With this, the following should make sense

Example

consider

$$i\frac{\partial}{\partial k_1}\, \tilde{p}(\mathbf{k})\bigg|_{\mathbf{k}=0} = \int p(x_1, \ldots, x_N)\, \exp\left(-i\sum_{j=1}^{N} k_j x_j\right) x_1\, dx_1 \cdots dx_N\,\bigg|_{\mathbf{k}=0} = \int x_1\, p(x_1, \ldots, x_N)\, dx_1 \cdots dx_N = \langle x_1 \rangle,$$
Example

and consider in general

$$\langle x_1^{n_1} x_2^{n_2} \cdots x_N^{n_N} \rangle = \left[i\frac{\partial}{\partial k_1}\right]^{n_1} \left[i\frac{\partial}{\partial k_2}\right]^{n_2} \cdots \left[i\frac{\partial}{\partial k_N}\right]^{n_N} \tilde{p}(\mathbf{k} = 0),$$

similarly for cumulants we have

$$\langle X_1^{m_1} X_2^{m_2} \cdots X_n^{m_n} \rangle_c = \left(i\frac{\partial}{\partial k_1}\right)^{m_1} \cdots \left(i\frac{\partial}{\partial k_n}\right)^{m_n} \ln \tilde{p}(\mathbf{k})\bigg|_{\mathbf{k}=0}.$$

which should make sense if you recall how cumulants are defined.

Example

The same "points in bags" argument for relating cumulants and moments works here: if we want to put two 1s and one 2 into bags, the different configurations are (112), two ways for (1)(12), one way for (2)(11), and one way for (1)(1)(2), so

$$\langle X_1^2 X_2 \rangle = \langle X_1^2 X_2 \rangle_c + 2\langle X_1 \rangle_c \langle X_1 X_2 \rangle_c + \langle X_2 \rangle_c \langle X_1^2 \rangle_c + \langle X_1 \rangle_c^2 \langle X_2 \rangle_c.$$
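This relation can be checked numerically (my own sketch, not from the notes) on a correlated 2D Gaussian, for which the connected part $\langle X_1^2 X_2 \rangle_c$ vanishes; the means and covariance below are arbitrary test values.

```python
# A minimal sketch: check <X1^2 X2> = 2<X1>_c <X1 X2>_c + <X2>_c <X1^2>_c + <X1>_c^2 <X2>_c
# for a Gaussian (its third-order connected part is zero).
import numpy as np

lam = np.array([0.5, -1.0])
C = np.array([[1.0, 0.3],
              [0.3, 2.0]])                            # second cumulants (covariance)
x = np.random.default_rng(3).multivariate_normal(lam, C, size=10**6)
lhs = np.mean(x[:, 0]**2 * x[:, 1])
rhs = 2*lam[0]*C[0, 1] + lam[1]*C[0, 0] + lam[0]**2*lam[1]
print(lhs, rhs)                                       # agree up to sampling error
```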
Proposition

The joint Gaussian distribution in $N$ dimensions is given by

$$p(\mathbf{x}) = \frac{1}{\sqrt{(2\pi)^N \det[C]}}\, \exp\left[-\frac{1}{2}\sum_{m,n} (C^{-1})_{mn}\, (x_m - \lambda_m)(x_n - \lambda_n)\right]$$

Proof: first recall the univariate Gaussian distribution, but this time suppose we have $N$ independent univariate Gaussian random variables $x_1, x_2, \ldots, x_N$, each with its own mean $\lambda_j$ and variance $\sigma_j^2$. The PDF for each $x_j$ is then:

$$p_j(x_j) = \frac{1}{\sqrt{2\pi\sigma_j^2}} \exp\left[-\frac{(x_j - \lambda_j)^2}{2\sigma_j^2}\right]$$

since they are independent, we have for $\mathbf{x} = (x_1, \ldots, x_N)^T$

$$p(\mathbf{x}) = \prod_{j=1}^{N} p_j(x_j) = \prod_{j=1}^{N} \frac{1}{\sqrt{2\pi\sigma_j^2}} \exp\left[-\frac{(x_j - \lambda_j)^2}{2\sigma_j^2}\right]$$

which we may simplify to

$$p(\mathbf{x}) = \frac{1}{(2\pi)^{N/2} \sqrt{\prod_{j=1}^{N} \sigma_j^2}}\, \exp\left[-\frac{1}{2}\sum_{j=1}^{N} \frac{(x_j - \lambda_j)^2}{\sigma_j^2}\right]$$

in matrix form we may write

$$\frac{1}{2}\sum_{j=1}^{N} \frac{(x_j - \lambda_j)^2}{\sigma_j^2} = \frac{1}{2}(\mathbf{x} - \boldsymbol{\lambda})^T C^{-1} (\mathbf{x} - \boldsymbol{\lambda}),$$

where $C = \mathrm{diag}(\sigma_1^2, \sigma_2^2, \ldots, \sigma_N^2)$ and $\boldsymbol{\lambda} = (\lambda_1, \ldots, \lambda_N)^T$. Therefore we may also rewrite this as

$$p(\mathbf{x}) = \frac{1}{\sqrt{(2\pi)^N \det[C]}}\, \exp\left[-\frac{1}{2}(\mathbf{x} - \boldsymbol{\lambda})^T C^{-1} (\mathbf{x} - \boldsymbol{\lambda})\right]$$

(the same form holds for a general, non-diagonal $C$; it follows from the independent case by a linear change of variables).

as for the characteristic function, first recall that

$$\tilde{p}(\mathbf{k}) = \int p(\mathbf{x})\, \exp\left(-i\sum_{j=1}^{N} k_j x_j\right) dx_1 \cdots dx_N$$

so substituting the joint PDF we obtain

$$\tilde{p}(\mathbf{k}) = \int \left[\prod_{j=1}^{N} p_j(x_j)\right] \exp\left(-i\sum_{j=1}^{N} k_j x_j\right) d\mathbf{x} = \prod_{j=1}^{N}\left[\int p_j(x_j)\, e^{-ik_j x_j}\, dx_j\right] = \prod_{j=1}^{N} \tilde{p}_j(k_j)$$

now for each $\tilde{p}_j(k_j)$, recalling the univariate result from earlier, we get

$$\tilde{p}(\mathbf{k}) = \prod_{j=1}^{N} \exp\left[-ik_j\lambda_j - \frac{1}{2}\sigma_j^2 k_j^2\right] = \exp\left[-i\sum_{j=1}^{N} k_j\lambda_j - \frac{1}{2}\sum_{j=1}^{N} \sigma_j^2 k_j^2\right]$$

finally, to get the matrix form we first define the mean vector $\boldsymbol{\lambda} = (\lambda_1, \ldots, \lambda_N)^T$ and wavevector $\mathbf{k} = (k_1, \ldots, k_N)^T$. The linear term is

$$\sum_{j=1}^{N} k_j \lambda_j = \mathbf{k}^T \boldsymbol{\lambda}.$$

For the quadratic term, since the variables are independent, the covariance matrix $C$ is diagonal, $C = \mathrm{diag}(\sigma_1^2, \ldots, \sigma_N^2)$, so:

$$\sum_{j=1}^{N} \sigma_j^2 k_j^2 = \sum_{m,n=1}^{N} C_{mn}\, k_m k_n = \mathbf{k}^T C\, \mathbf{k},$$

(with off-diagonals zero). Thus:

$$\tilde{p}(\mathbf{k}) = \exp\left[-i\,\mathbf{k}^T\boldsymbol{\lambda} - \frac{1}{2}\mathbf{k}^T C\, \mathbf{k}\right]$$

comparing with the univariate case, we see that

$$\langle x_m \rangle_c = \lambda_m, \qquad \langle x_m x_n \rangle_c = C_{mn},$$
Theorem

Wick's theorem: suppose we have a multivariate Gaussian with $\boldsymbol{\lambda} = 0$; then

$$\langle X_1^{m_1} \cdots X_n^{m_n} \rangle = \begin{cases} 0 & \text{if } \sum_i m_i \text{ is odd} \\ \text{sum over all pairwise contractions} & \text{otherwise} \end{cases}$$

proof: study quantum field theory first...for now just assume this
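Even without the proof, the statement is easy to test numerically (my own sketch, not from the notes): for a zero-mean Gaussian, $\langle x_1 x_2 x_3 x_4 \rangle = C_{12}C_{34} + C_{13}C_{24} + C_{14}C_{23}$; the covariance matrix below is an arbitrary positive-definite choice.

```python
# A minimal sketch: check the three pairwise contractions for a zero-mean 4D Gaussian.
import numpy as np

C = np.array([[1.0, 0.5, 0.2, 0.1],
              [0.5, 1.0, 0.3, 0.2],
              [0.2, 0.3, 1.0, 0.4],
              [0.1, 0.2, 0.4, 1.0]])
x = np.random.default_rng(4).multivariate_normal(np.zeros(4), C, size=2 * 10**6)
lhs = np.mean(x[:, 0] * x[:, 1] * x[:, 2] * x[:, 3])
rhs = C[0, 1]*C[2, 3] + C[0, 2]*C[1, 3] + C[0, 3]*C[1, 2]
print(lhs, rhs)                       # ~0.27 on both sides, up to sampling error
```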

We would now like to consider functions of random variables. First consider

$$\int_a^b p_Y(y)\, dy = \int dx\; p_X(x)\, \mathbf{1}_{[a,b]}\big(g(x)\big)$$

where $Y = g(X)$ and $\mathbf{1}_{[a,b]}$ is the indicator function that returns $1$ if $g(x)$ is in the range $[a,b]$ and $0$ otherwise. We can rewrite this as

$$= \int dx\; p_X(x) \int_a^b \delta\big(g(x) - y\big)\, dy$$

now because $a, b$ are arbitrary, we have

$$p_Y(y) = \int dx\; p_X(x)\, \delta\big(g(x) - y\big).$$

with this relation we may easily generalize to multi-dimensions like so

$$p_Y(y) = \int \left(\prod_i dx_i\right) p(x_1, \ldots, x_n)\, \delta\big(g(x_1, \ldots, x_n) - y\big).$$
Example

Let $Y = X_1^2 + X_2^2$, where $X_1, X_2$ are independent random variables.

Then we can write the probability distribution function as

$$p_Y(y) = \int dx_1\, dx_2\; p_1(x_1)\, p_2(x_2)\, \delta(x_1^2 + x_2^2 - y),$$

and this can be simplified most easily by using a (polar) change of variables: set

$$r^2 = x_1^2 + x_2^2 \quad\Longrightarrow\quad dx_1\, dx_2 = r\, dr\, d\theta = \tfrac{1}{2}\, d\theta\, d(r^2),$$

so that $x_1 = r\cos\theta$ and $x_2 = r\sin\theta$. Then

$$p_Y(y) = \frac{1}{2}\int d\theta\, d(r^2)\; p_1(r\cos\theta)\, p_2(r\sin\theta)\, \delta(r^2 - y),$$

and now we can plug in $r^2 = y \Rightarrow r = \sqrt{y}$ (in polar coordinates, $r$ is always nonnegative) wherever it appears to get

$$p_Y(y) = \int_0^{2\pi} \frac{d\theta}{2}\; p_1(\sqrt{y}\cos\theta)\, p_2(\sqrt{y}\sin\theta),$$

and we've removed the delta function from the expression.
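For instance (my own sketch, not from the notes), if $X_1, X_2$ are standard normals the formula evaluates to $p_Y(y) = \tfrac{1}{2} e^{-y/2}$, which is easy to confirm against a histogram of samples.

```python
# A minimal sketch: compare a sampled histogram of Y = X1^2 + X2^2 (X1, X2 standard
# normal) with the prediction p_Y(y) = 0.5 * exp(-y/2).
import numpy as np

rng = np.random.default_rng(5)
y = rng.normal(size=10**6)**2 + rng.normal(size=10**6)**2
hist, edges = np.histogram(y, bins=50, range=(0, 10), density=True)
centers = 0.5 * (edges[:-1] + edges[1:])
print(np.max(np.abs(hist - 0.5 * np.exp(-centers / 2))))   # close to zero
```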

3 Kinetic Theory of Gases

Kinetic theory studies the macroscopic properties of large numbers of particles, starting from their (classical) equations of motion.

First we consider how to define "equilibrium" for a system of particles. Consider a dilute (nearly ideal) gas.

At any time $t$, the microstate of a system of $N$ particles is described by specifying the positions $\vec{q}_i(t)$ and momenta $\vec{p}_i(t)$ of all particles.

The microstate corresponds to a point $\mu(t)$ in the $6N$-dimensional phase space $\Gamma = \prod_{i=1}^{N} \{\vec{q}_i, \vec{p}_i\}$

Fact

The time evolution of this point is governed by the canonical equations

$$\begin{cases} \dfrac{d\vec{q}_i}{dt} = \dfrac{\partial H}{\partial \vec{p}_i} \\[2mm] \dfrac{d\vec{p}_i}{dt} = -\dfrac{\partial H}{\partial \vec{q}_i} \end{cases}$$

where the Hamiltonian $H(\mathbf{p}, \mathbf{q})$ describes the total energy in terms of the set of coordinates $\mathbf{q} \equiv \{\vec{q}_1, \vec{q}_2, \ldots, \vec{q}_N\}$ and momenta $\mathbf{p} \equiv \{\vec{p}_1, \vec{p}_2, \ldots, \vec{p}_N\}$
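As a concrete illustration (my own sketch, not from the notes), the canonical equations for a single 1D harmonic oscillator, $H = p^2/2m + kq^2/2$, can be stepped forward with a simple symplectic Euler scheme; the parameter values are arbitrary.

```python
# A minimal sketch: integrate dq/dt = dH/dp, dp/dt = -dH/dq for a harmonic oscillator.
m, kspring, dt = 1.0, 1.0, 0.01
q, p = 1.0, 0.0
H0 = p**2 / (2*m) + kspring * q**2 / 2
for _ in range(10**4):
    p -= kspring * q * dt            # dp/dt = -dH/dq
    q += p / m * dt                  # dq/dt =  dH/dp
print(H0, p**2 / (2*m) + kspring * q**2 / 2)   # energy stays near its initial value
```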

Now, as formulated within thermodynamics, the macrostate $M$ of an ideal gas in equilibrium is described by a small number of state functions such as $E, T, P, N$.

Fact

Many different microstates can represent the same macrostate (i.e., a many-to-one relationship). This is because particles can be arranged in countless ways (positions and velocities) while still giving the same average properties like temperature or pressure

Consider $\mathcal{N}$ copies of a particular macrostate, each described by a different representative point $\mu(t)$ in the phase space $\Gamma$.

Definition

A phase space density $\rho(\mathbf{p}, \mathbf{q}, t)$ is defined by

$$\rho(\mathbf{p}, \mathbf{q}, t)\, d\Gamma = \lim_{\mathcal{N} \to \infty} \frac{d\mathcal{N}(\mathbf{p}, \mathbf{q}, t)}{\mathcal{N}}$$

to make sense of this: essentially there are $\mathcal{N}$ representative points (microstates) in the phase space $\Gamma$, of which $d\mathcal{N}$ are contained in the infinitesimal volume $d\Gamma$ around the point $(\mathbf{p}, \mathbf{q})$. Therefore $\lim_{\mathcal{N} \to \infty} d\mathcal{N}/\mathcal{N}$ represents the objective probability (defined above)

knowing this, it is then clear that we must have $\int d\Gamma\, \rho = 1$ when integrating over the whole phase space $\Gamma$, for $\rho$ to be a properly normalized probability density function. With these we now define

Definition

Ensemble averages for an arbitrary function $O(\mathbf{p}, \mathbf{q})$

$$\langle O \rangle = \int d\Gamma\; \rho(\mathbf{p}, \mathbf{q}, t)\, O(\mathbf{p}, \mathbf{q})$$
Definition

When the exact microstate $\mu$ is specified, the system is said to be in a pure state

On the other hand, when our knowledge of the system is probabilistic, in the sense that the microstate is drawn from a density $\rho(\Gamma)$, it is said to be in a mixed state

3.2 Liouville's Theorem

Theorem

Liouville's theorem states that the phase space density $\rho(\Gamma, t)$ behaves like an incompressible fluid

First consider

../../../Attachments/Pasted image 20250717010847.png

In the time interval $\delta t$ we have $(\mathbf{p}, \mathbf{q}) \to (\mathbf{p}', \mathbf{q}')$ like so

$$q_\alpha' = q_\alpha + \dot{q}_\alpha\, \delta t + O(\delta t^2), \qquad p_\alpha' = p_\alpha + \dot{p}_\alpha\, \delta t + O(\delta t^2).$$

which is essentially a first-order Taylor expansion.

Now consider the case of $q$ first. Take two points $A, B$ separated by $dq_\alpha$, and follow their time evolutions after $\delta t$:

$$q_\alpha'(B) = (q_\alpha + dq_\alpha) + \dot{q}_\alpha(q_\alpha + dq_\alpha)\, \delta t + O(\delta t^2).$$

Since $dq_\alpha$ is small, expand $\dot{q}_\alpha$ around $q_\alpha$:

$$\dot{q}_\alpha(q_\alpha + dq_\alpha) = \dot{q}_\alpha(q_\alpha) + \frac{\partial \dot{q}_\alpha}{\partial q_\alpha}\, dq_\alpha + O(dq_\alpha^2).$$

Plug this into Point B's evolution:

$$q_\alpha'(B) = q_\alpha + dq_\alpha + \left[\dot{q}_\alpha(q_\alpha) + \frac{\partial \dot{q}_\alpha}{\partial q_\alpha}\, dq_\alpha\right]\delta t + O(\delta t^2), \qquad dq_\alpha' = q_\alpha'(B) - q_\alpha'(A).$$

Substitute the expressions:

$$dq_\alpha' = \left[q_\alpha + dq_\alpha + \dot{q}_\alpha(q_\alpha)\,\delta t + \frac{\partial \dot{q}_\alpha}{\partial q_\alpha}\, dq_\alpha\,\delta t + O(\delta t^2)\right] - \left[q_\alpha + \dot{q}_\alpha(q_\alpha)\,\delta t + O(\delta t^2)\right].$$

Simplifying, and repeating everything we have done so far for $q_\alpha$ for $p_\alpha$ as well, we get

$$\begin{cases} dq_\alpha' = dq_\alpha + \dfrac{\partial \dot{q}_\alpha}{\partial q_\alpha}\, dq_\alpha\,\delta t + O(\delta t^2) \\[2mm] dp_\alpha' = dp_\alpha + \dfrac{\partial \dot{p}_\alpha}{\partial p_\alpha}\, dp_\alpha\,\delta t + O(\delta t^2). \end{cases}$$

we note that $d\Gamma = \prod_{i=1}^{N} d^3\vec{p}_i\, d^3\vec{q}_i$, and we have

$$dq_\alpha'\, dp_\alpha' = dq_\alpha\, dp_\alpha\left[1 + \left(\frac{\partial \dot{q}_\alpha}{\partial q_\alpha} + \frac{\partial \dot{p}_\alpha}{\partial p_\alpha}\right)\delta t + O(\delta t^2)\right].$$

However, the time evolution of the coordinates and momenta is governed by the canonical equations, from which we have

$$\frac{\partial \dot{q}_\alpha}{\partial q_\alpha} = \frac{\partial}{\partial q_\alpha}\frac{\partial H}{\partial p_\alpha} = \frac{\partial^2 H}{\partial p_\alpha\, \partial q_\alpha}, \quad\text{and}\quad \frac{\partial \dot{p}_\alpha}{\partial p_\alpha} = \frac{\partial}{\partial p_\alpha}\left(-\frac{\partial H}{\partial q_\alpha}\right) = -\frac{\partial^2 H}{\partial q_\alpha\, \partial p_\alpha}.$$

therefore

$$dq_\alpha'\, dp_\alpha' = dq_\alpha\, dp_\alpha\left[1 + O(\delta t^2)\right]$$

that is, $d\Gamma' = d\Gamma$: in $\delta t$ we had $(\mathbf{p}, \mathbf{q}) \to (\mathbf{p}', \mathbf{q}')$, but the volume that $d\mathcal{N}$ occupies stays the same
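The volume-preserving character of Hamiltonian flow can be visualized directly (my own sketch, not from the notes): below, the boundary of a small square of initial conditions for a pendulum, $H = p^2/2 - \cos q$, is evolved with an area-preserving (symplectic Euler) step, and the enclosed area is monitored with the shoelace formula; the initial square and step size are arbitrary choices.

```python
# A minimal sketch: phase-space area is conserved as a small square of initial
# conditions deforms under the pendulum flow dq/dt = p, dp/dt = -sin(q).
import numpy as np

def shoelace(q, p):
    """Area enclosed by the polygon with vertices (q_i, p_i)."""
    return 0.5 * abs(np.dot(q, np.roll(p, -1)) - np.dot(p, np.roll(q, -1)))

# boundary of a 0.2 x 0.2 square centred at (q, p) = (1, 0), traced counterclockwise
s = np.linspace(0, 1, 400, endpoint=False)
q = np.concatenate([0.9 + 0.2*s, np.full(400, 1.1), 1.1 - 0.2*s, np.full(400, 0.9)])
p = np.concatenate([np.full(400, -0.1), -0.1 + 0.2*s, np.full(400, 0.1), 0.1 - 0.2*s])

dt = 0.01
print("initial area:", shoelace(q, p))          # 0.04
for _ in range(500):
    p = p - np.sin(q) * dt                      # dp/dt = -dH/dq
    q = q + p * dt                              # dq/dt =  dH/dp (using the updated p)
print("area after evolution:", shoelace(q, p))  # still ~0.04, though the square has sheared
```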

Fact

more precisely, $\rho$ behaves like the density of an incompressible fluid

The incompressibility condition $\rho(\mathbf{p}', \mathbf{q}', t + \delta t) = \rho(\mathbf{p}, \mathbf{q}, t)$ can be written in differential form as

$$\frac{d\rho}{dt} = \frac{\partial \rho}{\partial t} + \sum_{\alpha=1}^{3N}\left(\frac{\partial \rho}{\partial p_\alpha}\frac{dp_\alpha}{dt} + \frac{\partial \rho}{\partial q_\alpha}\frac{dq_\alpha}{dt}\right) = 0.$$

now substituting the canonical equations into this we obtain

$$\frac{\partial \rho}{\partial t} = \sum_{\alpha=1}^{3N}\left(\frac{\partial \rho}{\partial p_\alpha}\frac{\partial H}{\partial q_\alpha} - \frac{\partial \rho}{\partial q_\alpha}\frac{\partial H}{\partial p_\alpha}\right) = -\{\rho, H\},$$

in equilibrium we should have $\partial_t \rho_{\mathrm{eq}} = 0$, so from the above this means we have $\{\rho_{\mathrm{eq}}, H\} = 0$

Definition

The Poisson bracket is defined as

$$\{A, B\} \equiv \sum_{\alpha=1}^{3N}\left(\frac{\partial A}{\partial q_\alpha}\frac{\partial B}{\partial p_\alpha} - \frac{\partial A}{\partial p_\alpha}\frac{\partial B}{\partial q_\alpha}\right) = -\{B, A\}.$$

which we used to rewrite our incompressibility relation above

Question

So what are the consequences of Liouville's theorem?

4 Classical Statistical Mechanics

4.2 The microcanonical ensemble